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DETECTING LOCAL NETWORK MOTIFS 

By Etienne Birmele*, 
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Studying the topology of so-called real networks, that is networks 
^J , obtained from sociological or biological data for instance, has become 

CN ■ a major field of interest in the last decade. One way to deal with it 

is to consider that networks are built from small functional units 
called motifs, which can be found by looking for small subgraphs 
whose numbers of occurrences in the whole network are surprisingly 
00 | high. In this article, we propose to define motifs through a local over- 

representation in the network and develop a statistic to detect them 
^ ' without relying on simulations. We then illustrate the performance of 

our procedure on simulated and real data, recovering already known 
biologically relevant motifs. Moreover, we explain how our method 
gives some information about the respective roles of the vertices in a 
motif. 
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1. Introduction. One way to reach a better understanding of the struc- 
** \ ture of networks is to summarize part of the information in the counts of 

small subgraphs. That method is used for decades in social network studies, 
for example via the triad censuses (Watts and Strogatz, 1998). More recent 
work indicates that biological networks show recurrent small patterns, called 
network motifs and introduced by Milo et al. (2002). They can be thought 
of as small units of given function from which the networks are built. For in- 
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stance, Alon (2007) describes the regulation role in transcriptional networks 
of a pattern of three vertices called the feed- forward loop. It is therefore 
quite natural to ask which are the patterns that are over-represented in a 
given network. 

Looking for over-representation requires a null model to compare the ob- 
served network with. The most popular model is the stub-rewiring model 
introduced by Milo et al. (2002), which is used in several methods for motif 
detection including those of Kashani et al. (2009); Kashtan et al. (2004); 
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2 E. BIRMELE 

Wernicke and Rasche (2006), the method based on graph alignments of 
Berg and Lassig (2004) and the method devoted to labeled graphs of Banks et al. 
(2008). It is a model requiring the generation of a large number of graphs 
whose nodes have the same in and out-degrees as the observed network. 
However, Artzy-Randrup et al. (2004) point out that this method does not 
take into account the preferential links between some vertices and the high 
local density, which are two major features of biological networks. They also 
show that the use of the stub-rewiring model and of a Z-score to detect 
motifs may lead to false positives. 

Another way to define the null model is to consider a random graph model 
defined by a probability distribution. Litter ature about random graph mod- 
els and their suitability to real networks is abundant (see e.g. Chung and Lu, 
2006). Among the existing models, mixture models play an important role 
as they allow different link probabilities between classes of vertices and thus 
model the heterogeneity of connection patterns. Moreover, mean and vari- 
ance calculation for the pattern counts are tractable under such models, as 
shown by Picard et al. (2008). 

It is important that motifs are defined conditionally on subpatterns oc- 
currences, as pointed out by Milo et al.. Indeed, a pattern may appear as 
over-represented because it contains an over-represented subpattern, which 
is in fact the biologically relevant structure. This conditioning issue is also 
taken into account by Banks et al. in the context of labeled graph. Never- 
theless, in both cases, the real network is compared to graphs generated by 
the stub-rewiring procedure. Therefore, to study the patterns of size k, it 
is necessary to generate a large number of graphs with the same number of 
each type of subgraphs of size ranging between 2 and k — \ as in the observed 
network. In practice, only the cases k = 3 and k = 4 are implemented to 
our knowledge (Milo et al, 2002). 

Finally, Dobrin et al. (2004) show that the motifs found in the Yeast tran- 
scriptional regulatory network aggregate, that is they highly concentrate in 
some regions of the network, indicating that biologically meaningful mech- 
anisms may not be spread uniformly in the network. Therefore, it seems 
natural to look for local over-representation of patterns. 

That phenomenon is also highlighted in the more biologically driven work 
by Zhang et al. (2005), who suggest to look for motif themes rather than 
motifs. They define themes as recurring higher- order interconnection pat- 
terns that encompass multiple occurrences of network motifs and show their 
biological relevance in the different networks associated to Yeast. In other 
words, themes are patterns corresponding to several occurrences of a motif 
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DETECTING LOCAL NETWORK MOTIFS 3 

that share some of their nodes. 

The major contribution of this paper is to propose a definition of a local 
motif based on the themes of Zhang et al, and a procedure to detect the 
local motifs of fixed size A; in a network. This procedure builds upon earlier 
works in motif detection, being to our knowledge the first approach taking 
into account the conditioning on a subpattern without size limitation on the 
considered pattern, as well as the local character of patterns. Moreover, it 
makes no assumption on the law of the pattern counts. It is composed of 
four main steps: 

• Inference of the parameters of the null model. We consider a model 
in which each node belongs to a fixed class and each edge is drawn 
independently from the others under a Bernoulli law whose parameter 
depends only on the classes of its endvertices. 

• Enumeration and localization of all patterns of size k present in the 
studied network and of their subpatterns. 

• Assignment of a p-value to each pair (pattern; subpattern) present in 
the network for testing local over-representation. The key idea of that 
assignment is to show that the distribution of the number of local 
occurrences of a pattern is close to a Poisson distribution, allowing us 
to bound the exact p- value. 

• A filtering procedure which ensures that every emergent local motif 
conveys some novel information about the network structure when 
compared to its subpatterns. 

We define precisely the notion of local over-representation and detail the 
four steps of our procedure in Section 2. As the obtained p- value is in fact an 
upper bound of the exact one, we investigate the tightness of that bound in 
Section 3. We then show results on both simulated and real data in Section 4. 

2. Methods. 

2.1. Local network motifs. We consider a network G of interest on n 
vertices. In this work, we consider directed graphs, with possible loops and 
opposite edges, but all the results can easily be extended to undirected 
graphs. 

A pattern m of size /c is a directed graph on k vertices, from which we 
want to know if it is locally over-represented in G. As a convention, we will 
denote by (a, b, ... ) the vertices of m and by (u, v, . . .) those of G. 
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4 E. BIRMELE 

An automorphism of m is a permutation <j) of its vertices such that, for 
every pair (a, b) of vertices, <p(a)<p(b) is an edge if and only if ah is an edge. 
Let 1Z be the relation defined by alZb if b is the image of a by an automor- 
phism. 1Z is an equivalence relation on the vertices of m and those vertices 
can therefore be partitioned into equivalence classes, which we call deletion 
classes. For example, in the bi-fan pattern shown in Figure 1, the permuta- 
tion exchanging a with b and c with d is an isomorphism. However, a and 
c are not equivalent as they have different outdegrees. Thus the bi-fan has 
two deletion classes which are {a,b} and {c,d}. 

Let (Ci, . . . , Ck) be the deletion classes of m and (ii, . . . , Ik) their re- 
spective sizes. A position U in a network G for m will then denote a list 
(Vi, . . . , Vk) of disjoint sets of vertices of G with respective sizes (z'x, ... ,1k)- 
That position is an occurrence of m in G if the subgraph of G induced by 
the vertices of U is isomorphic to m. Writing a position as a list of sets of 
vertices ensures to count every occurrence of a pattern only once. However, 
for clarity, we will write positions as lists of vertices throughout the article. 

A subpattern of m denotes a pair (C, m'), where C is a deletion class of 
m and m' the pattern on k — 1 vertices obtained by deleting any vertex 
of C. As all vertices in C are isomorphic, their respective deletion lead to 
isomorphic subgraphs and the notion of subpattern is thus well defined. 

However, the opposite is not true, that is the subgraphs of m obtained 
by deleting a vertex a or a vertex b may be isomorphic while a and b do 
not belong to the same deletion class. Consider for example the feed- forward 
loop shown in Figure 1. Deleting any of its three vertices leads to a single 
edge but all the vertices belong to different deletion classes as they are not 
topologically equivalent (they have for instance different out-degrees). 

In the following, we adopt the graphical convention shown in the last 
column of Figure 1 to draw at the same time a pattern m and one of its 
subpatterns (C, m'). The whole graph represents m and the squared vertex 
whose adjacent edges are dotted is a vertex of C. The pattern m' is thus 
obtained by deleting that vertex. 

Let m be a pattern and (C, m') one of its subpatterns. An occurrence U 
of m in G is an extension of an occurrence U' of m' if the vertex set of U' 
is a subset of the vertex set of U . We define the (m, C)-theme on U' as the 
subgraph of the network induced by the occurrence of m' at U' and all its 
extensions. The number of those extensions will be the order of the theme 
(see Figure 2 for an illustration). 

We define a potential local motif as a pattern which is locally over-represented 
with respect to at least one of its subpatterns. In other words, m is a po- 
tential local motif with respect to (C, m') if there exist an (m, C)-theme 
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DETECTING LOCAL NETWORK MOTIFS 5 

Pattern Subpatterns Synthetic drawing 

Feed-forward loop 



a J- \b 



{a}, 









{&}, 




Bi-fan 



{c}, 




{a,b}, 



{c,d}, 





Fig 1 . The feed-forward loop and bi-fan patterns with the list of their subpatterns. The last 
column shows how we represent a pattern and one of its subpatterns in a single drawing. 



whose order is significantly higher than the expected order in a random 
model to be specified in Section 2.2. Note that the local over-representation 
is different from the global one. Indeed, a pattern having a high number of 
disjoint occurrences may be globally over-represented without being a motif 
according to our definition. On the other hand, a pattern may be locally 
over-represented without having a large number of occurrences in the whole 
network. 

Finally, a potential local motif is a local motif if the information it conveys 
is not redundant with a smaller local motif, that is if it is not filtered out 
by the procedure to be described in subsection 2.5. 
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m_> 



PDR5 



IPT1 




HXT9 



PDR3 



YOR1 




Fig 2. The shown subnetwork of the Yeast regulation network is a (mi, Ci) -theme 
of order 6 at position {{PDR1.PDR3}) and a (1112, C2) -theme of order 5 at position 
({PDR1, PDR3},{PDR5}). C\ and C2 are the respective deletion classes of the squared 
vertices. 



2.2. The random graph model. The random generation model we con- 
sider is based on blockmodels (White, Boorman and Breiger, 1976), with a 
fixed and known class for each vertex and random edges. It depends on a 
4-tuple of parameters (n, Q, Z, II), where: 

• n is the number of vertices, 

• Q is the number of classes of the model, 

• Z € {1, . . . , Q} n is a vector giving the class of each vertex, 

• II is a Q x Q connectivity matrix. The coefficient TL q i £ [0, 1] of that 
matrix indicates how likely a vertex of class q and a vertex of class I 
are linked by an edge. 

Under this model, all the edges of the random graph are drawn indepen- 
dently under Bernoulli laws: denoting by X uv the indicator variable of the 
edge between vertices u and v, 

X uv ~ B(TL Z (u),z(v))- 

Such a model belongs to the family of blockmodels, which are widely used 
to describe real network data (Nowicki and Snijders, 2001). It has at least 
two main advantages. First, it takes into account the preferential attachment 
process between several groups of nodes in the network; second, as all the 
edges are independent, calculations remain tractable. 

Note that it is not the classical random graph model introduced by Erdds 
and Renyi (1959) as the edge probabilities are non uniform (unless Q = 1). 
It is neither a mixture model as the classes of the vertices are not random. 



imsart-aap ver. 2010/04/27 file: Birmele.tex date: July 9, 2010 



DETECTING LOCAL NETWORK MOTIFS 7 

Adding randomness on the classes would induce correlations between the 
edges and therefore invalidate the Poisson approximation of Section 2.3. 

Nevertheless, the vertex classes and the connectivity matrix II may be in- 
ferred using estimation algorithms for graph mixtures (Daudin, Picard and Robin, 
2008; Hofman and Wiggins, 2008; Latouche, Birmele and Ambroise, 2008; 
Nowicki and Snijders, 2001). The graph mixture models, when applied for 
Q components in the mixture, infer for each vertex a vector of length Q 
giving the probability for that vertex to belong to each class, as well as a 
connectivity matrix. We choose here to assign each vertex to its more prob- 
able class, and thus deal with the the observed network as if the groups were 
known and fixed. 

Our general random graph framework also contains another widely used 
model where u and v are linked with probability proportional to the product 
of their respective observed degrees. Under that model, which we will call 
Expected Degree, the expected degree of each vertex is almost equal to its 
observed degree (Matias et al., 2006). However, under this model, nodes are 
in the same class if and only if they have the same in- and outdegrees. 
Therefore, the number of classes may be large on real networks, having a 
deep impact on the running time of our motif detection procedure. 

Overlapping classes can also be taken into account by using for instance 
the Mixed Membership Stochastic Blockmodel from Airoldi et al. (2008) or 
the model Overlapping Stochastic Blockmodel from Latouche, Birmele and Ambroise 
(to appear). 

2.3. Local over-representation. Consider a pattern m of size k, a sub- 
pattern (C, m') of m and a set U of k — 1 vertices in some graph G. If U 
corresponds to an occurrence of m', we write G[U] ~ m'. Let iV[/(m) denote 
the order of the (m, C)-theme located at U. If U does not correspond to an 
occurrence of m', we set iV[/(m) = 0. 

We define A[/(m) = ¥.(Nu (m)\G[U] ~ m') and A[/(m) as the normalized 

quantity 

_ Nu(m) - Xujm) 

A[/(mJ . 

A[/(m) 

Looking for themes whose order is much larger than expected under our 
model is then equivalent to look for values of Afy(m) significantly larger 
than 1. 

In the following, we will omit the reference to m when there is no ambi- 
guity. 

Let us consider a set U corresponding to an occurrence of m'. For each 
vertex v ^ U, we denote by I]j the indicator random variable which is equal 
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8 E. BIRMELE 

to 1 if adding v to U yields an occurrence of m in G. Let pfj be the mean 
value of I v . Then Xu = YlviuPu anc ^ those quantities can easily be deduced 
from the parameters of the random graph model. 

As the indicator random variables {Ijj) v du are independent, it is well 
known that the law of their sum, that is the law of Njj, can be approxi- 
mated by a Poisson law (Barbour, Hoist and Janson, 1992). More precisely, 
denoting by cItv the total variation distance between two distributions, we 
have 

dTv(C(N v ),C(Po(X v ))) < min(l, A^ 1 ) J>£) 2 , 

where C(Po(Xu)) is the Poisson distribution with parameter Xu- 
This approximation may be used to determine an upper bound for the 
p- value of testing if the (m, C)-theme order is surprisingly large. In practice, 
such bounds are quite accurate as the pjj's are small. 

Nevertheless, a better approximation can be obtained for the tail probabil- 
ities by using Chen-Stein's method (Chen, 1975), as shown in Barbour, Hoist and Janson. 

(2.1) VK > 2Xu, F(Nu > K\G[U) ~ m') < ^~^ Po(Xu){[K, +oo)), 

where, for any measurable set A, Po(Xjj)(A) is the probability of A under 
the Poisson distribution with parameter Xjj. 

Setting K = |"A{/(l + £)] for some t > 1 and using elementary bounds and 
transformations developed in Appendix A, we obtain 



(2.2) Vt > l,P(A f/ > t) < F(G[U] ~ m 7 ) "^ F + T . e -Xu((i+t)Hi+t)-t) 

\f2irXu{t — 1) 

For positive values of t which are smaller than or close to 1, a sharper 
bound can be obtained by using a concentration inequality on the sum of 
independent random variables bounded between and 1 (McDiarmid, 1998). 



(2.3) Vt > 0, P(Aff > t) < F(G[U] ~ m')e 



-Xu((l+t)ln(l+t)-t) 



Moreover, it is straightforward to verify that the function of t defined on 
]1, +oo[ by /J" is decreasing and is equal to 1 at 

V 2,7r\{t- — 1 J 

l 



t x = l + — (1 + V1 + 167TA). 
47TA 

Therefore, coupling inequalities (2.2) and (2.3) yields a local bound for 
the tail probability of the theme order. 
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DETECTING LOCAL NETWORK MOTIFS 9 

Theorem 2.1. For any pattern m ; subpattern (C, m') and position U 
and for every positive t, let 



h(\,t) 
Then, Vt > 0, 



1 if t < t x , 

^ if t>t x . 



/27rA(t-l) 



F(A V > t) < P(G[J7] ~ m / )/i(A[/,i)e 



-At/((l+i)ln(l+t)-t) 



We thus have an exponentially decreasing local bound for the tail prob- 
ability of the centered and renormalized order of an (m, C)-theme on U. 
Moreover, that bound is easily computable from the parameters of the ran- 
dom graph model. 

2.4. A global statistic to detect local motifs. Theorem 2.1 allows to test 
whether there is a local over-representation at a given position U of an 
(m, C)-theme. However, the number of possible positions U is growing as 
n . We thus encounter a multiple testing problem. To overcome this issue, 
we build a statistic characterizing any local over-representation of a pattern 
somewhere in the graph. 

Let us consider the function g defined for every positive A and t by 

(2.4) g(X, t) = A((l + t) log(l + t)-t)- log(/ l (, A, £)). 

For any positive A, the function g(X, .) is a one-to-one increasing function, 
mapping ]0, +oo[ to itself and which is equivalent to Ailog(i) as t tends to 
infinity. Thus, the event g(Xjj,Ajj) much larger than 1 is equivalent to the 
event Ajj much larger than 1. 

For any positive t, let us apply Theorem 2.1 to y such that g(Xjj,y) = t. 
We then obtain 

Vt > 0, F(g(X u ,A u ) > t) < F(G[U] ~ m>-*. 

Noting that the event E t = {maxjj(g(Xjj, Ajj)) > t} is the union over all 
the possible positions U of the events E\j = {g(Xu, Ajj) > t}, and that the 
exponential term in the upper bound is independent of U, we obtain our 
main result, stated in the following theorem. 

Theorem 2.2. Let g be the function defined in Equation (2.4) and 
N(m') the random variable denoting the global number of occurrences of 
m' in G. Then, for every t > 0, 

(2.5) P( m&x(g(Xu, A v )) > t) < EiY(m')e~* 
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10 E. BIRMELE 

We thus obtain an upper bound on the global p- value for detecting a local 
over-representation of m with respect to the subpattern (C, m') occurring 
anywhere in the network. 

2.5. Motif selection criterion. Consider the two patterns and respective 
subpatterns of Figure 2. Let C\ and C2 denote the respective deletion classes 
of the subpatterns. Then, as shown by the figure, every (m2,C2)-theme of 
order K is an (mi, Ci)-theme of order K + 1. In that case, the fact that the 
(rri2, C2)-theme is of order significantly larger than expected is redundant 
with the same information for the (mi, Ci)-theme. 

To avoid such redundance in the final motif list, a pattern m will be con- 
sidered as a motif with respect to a subpattern (C, m') if the two following 
conditions hold: 

1. The p- value given by Theorem 2.2 is lower than a fixed threshold, that 
is m is a potential local motif; 

2. Let {a} be a vertex of C. There exist no set A of vertices of m such 
that 

• there is no edge between a and any vertex of A, 

• m\A is over-represented with respect to (D, m\ (.Au{a}), where 
D is the deletion class of {a} in m \ A . 

If there exists a set A satisfying the two points of the second condition, 
then the over-representation of m with respect to (C, m\ {a}) is considered 
as redundant with the over-representation of m \ A with respect to (D, m \ 
(A U {a}))- Thus, the pair (m, C) is filtered out from the local motif list. 

Figure 3 illustrates the filtering procedure. Moreover, it shows why the 
absence of any edge between vertex a and set A is required to filter out 
a potential local motif. Indeed, consider the first pattern of the figure for 
which A = {b} fulfills the previous condition. The corresponding theme 
doesn't convey any new information compared to the theme of the feed- 
forward loop obtained by deleting the vertex w. 

On the opposite, a theme of order k of the second pattern contains k 
supplementary edges compared to the feed-forward loop theme. The coeffi- 
cients of II being small in general because of the sparsity of real networks, 
the presence of those k edges is informative. Thus, that potential local motif 
is kept in the list of local motifs. 

2.6. Algorithmic issues. Given an integer value k, our procedure first 
needs to list all the patterns of size k occirring in the network, as well 
as their subpatterns. That issue is tackled by using the ESU algorithm of 
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(1) 



i 



-e (I 



(2) 





Fig 3. Illustration of the filtering procedure. (1) Suppose the two patterns shown are po- 
tential local motifs with respect to the deletion of a and that the feed-forward loop was 
previously declared as a motif with respect to the deletion of a. Then the left pattern is 
filtered out by applying the filtering procedure for A. — {b}. However, the second is declared 
as a motif because of the edge between b and a. (2) Themes of order 5 in the network G for 
the two previous potential motifs. In the second case, the edge from b to a in the pattern 
implies the presence of 5 additionnal edges in the theme. 



Wernicke (2005). It is important to note that this subgraph count has only 
to be done on the observed graph and not on a huge number of simulated 
ones. 

We then apply Theorem 2.2 to each pair (pattern, subpattern). The ma- 
jor cost in terms of computational time of that step is the computation of 
E(iV(m / )). Indeed, it requires to sum a probability of occurrence along all 
possible positions in the graph. It may be done more efficiently by group- 
ing the nodes belonging to the same class. This approach gives good results 
when the number of classes is not too large, which is the case in practice 
when estimating the classes using mixture models algorithms. However, it 
becomes a real drawback for motifs larger than 4 vertices when using the 
Expected Degree method and observed graph with more than 200 vertices. 

Finally, as the algorithm visits every position, it is not time-consuming to 
keep in memory all the positions and their respective theme orders in order 
to have a better interpretation of the results. 
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12 E. BIRMELE 

3. Lower bound. The fact that Theorem 2.2 gives an upper-bound of 
the exact p-value ensures that the number of false positives is controlled by 
the threshold used in the procedure. However, the tightness of that bound 
has to be taken into consideration to tackle the problem of false negatives. 

An evaluation of the tightness of the local bound given by Theorem 2.1 can 
be obtained for moderate deviations, as stated in the following proposition. 

Proposition 3.1. Consider any pattern m and subpattern (C, m'). Let 
U be a position corresponding to an occurrence of m' in G and define X2,u = 
^2vdu(Pu^ '• Denote by Bjj(t) the local upper bound on P(A[/ > t) given by 
Theorem 2.1. 

Suppose that X 2 u < \ ■ Then, for every t such that 1 < t < — 7== — 1 , 



F(Ac/ > t) > 



52^(1 ft) 



1 



Bu(t) -l Xu JL 1 + tJL 10Xu(l + t 

The proof relies on results about Poisson approximations for sums of 
independent random variables given in Barbour, Hoist and Janson and is 
detailed in Appendix B.l. 

This theorem shows that for infinite sequences of real numbers t^ n \ graphs 
GW and positions U^ such that t^ and (tXu)^ go to infinity and (tX 2 ,u/Xu) {n) 
goes to 0, the bound on P(A^ > t^ n >) is asymptotically tight. 

For example, let us consider the Erdos-Renyi model with a connection 
probability p( n > = —. That choice corresponds to a linear growth of the 
number of edges (Chung and Lu, 2006). Denote by k and n the respective 
sizes of m and G and by r the number of edges which are in m but not in 
m'. Then, for any position U, 

(3.1) A^ = (n-k + l)p r ~ n ^ +0O -^ I 

n r 1 

2r 

(3-2) Xfjj = (n-k + l)p 2r -^^-^-j 

Thus, for r > 2, choosing t^ n > ~ n a with r — 1 < a < r yields a sequence 
of thresholds for which the bound of Theorem 2.1 is asymptotically tight. 

The tightness of the global bound given by Theorem 2.2 is a more in- 
tricate issue. Using the notations E = {iaax.jj(g(Xi/,Ajj)) > t} and E\j = 
{g(Xu, Afj) > t}, the derivation of a lower bound for the event E t can be 
done using 



¥(E l ) > £>(£&) -^P(^n^). 



u u,v 
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The term ^ /U W(E\j) corresponding to the proposed upper bound, it is suf- 
ficient to derive tight upper bounds on the intersections E^DEy. Neverthe- 
less, for some patterns, the number of extensions at two overlapping positions 
U and V may be strongly correlated. For instance, consider Figure 2 and 
in particular the occurrence of the pattern rri2 at position (PDR1, PDR3, 
PDR5, IPT1). The number of extensions of m' 2 at position U =(PDR1, 
PDR3, PDR5) will be equal to the number of its extensions at position 
V =(PDR1, PDR3, IPT1). Therefore, the probability of the intersection is 
not small with respect to the probabilities of the single events. 

However, this approach allows us to show that the global upper bound 
we propose is tight in the sense that for some patterns and some random 
models corresponding to sparse graphs, it is asymptotically the best one. 

Proposition 3.2. Let m be a pattern of size k admitting some vertex 
a linked to every other vertex of m. Consider its subpattern (C, m') where 
C is the deletion class of a. 

Let p = maxjj IIjj- and suppose that p = 0(n~2~ e ), with e > -^. Let 
5 = min(e, 2ke — 1) > 0. Then 

P(max( 5 (A[/, At/)) > t) = (1-n) J^ V(g(kr, &u) > t), where n = 0(n~ 5 ) 

u 

Note that the condition on p still allows a growth rate of the number of 
edges of the order 0(n2 -e ), which is faster than the linear growth observed 
on real data (Chung and Lu, 2006). The detailed proof of the proposition is 
given in Appendix B.2. 

Combining Propositions 3.1 and 3.2 yields that the proposed upper bound 
is asymptotically tight for some pairs of patterns and subpatterns in a given 
range of model parameters. 

4. Illustration of the procedure. 

4.1. Simulated Data. 500,000 directed graphs with 90 vertices, which 
we will call the reference graphs, were generated under the model with three 
classes of 30 vertices each and connection probabilities set to 0.04 between 
vertices of the same class and 0.01 between vertices of different classes. The 
mean out-degree and in-degree under that model are both equal to 1.76. Our 
method is illustrated with the feed-forward loop an the bi-fan patterns (see 
Figure 1). The choice of the subpatterns is done by deleting the only vertex 
of in-degree 2 for the feed- forward loop and one of the vertices of in-degree 2 
for the bi-fan. Nevertheless, that choice plays no role in that particular case, 
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(a) Empirical p-value and upper bound for the feed-forward loop 





(b) Empirical p-value and upper bound for the bifan 




(c) Ratio between empirical p-value and upper-bound 





Fig 4. Empirical p-values and corresponding upper bound, (a) Both curves for the feed- 
forward loop in the 500,000 reference graphs and zoom, at the distribution tail, (b) Same as 
(a) for the bi-fan pattern, (c) Ratio between the empirical p-value and the upper bound for 
the reference graphs (black circles), the dense graphs (blue squares) and the large graphs 
(red diamonds). 



due to the symmetry of the model. Figure 4 (a) and (b) show the empirical 
tail probabilities and corresponding upper bounds given by Theorem 2.2, as 
functions of the parameter t. 

To evaluate the quality of the upper bound, the ratio between the em- 
pirical p-values and their upper bounds is shown in Figure 4 (c). The same 
ratio is also plotted for more dense graphs (30, 000 graphs sampled with the 
same number of nodes and classes and connection probabilities five times 
larger) and for graphs larger than the reference graphs but with comparable 
density (30, 000 graphs with 360 nodes, three classes of 120 nodes each and 
connection probabilities 0.01 and 0.0025). For the reference graphs, the ratio 
is about 1/10 for both patterns, with a minimum of 0.07 for the feed-forward 
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loop and 0.04 for the bi-fan. This ratio increases both for larger graphs and 
more dense graphs. 





Erdos 


ED 


MixNet 


BLOCKS 


A, 


4.3 e-46 




5.7 e-7 


1.3 e-6 


A 


5.2 e-14 


5.8 e-4 


8.2 e-6 


3.9 e-5 


<■ 


1.3 e-30 








,'?• 


3.2 e-14 


2.9 e-5 


3.1 e-8 


9.6 e-5 


K A 


(4.6 e-45) 


9.6 e-4 


(4.1 e-9) 


(1.2 e-6) 


? A 

A I 


(1.9 e-30) 




3.5 e-2 


3.8 e-2 


m 


1.6 e-4 









Table 1 

Local motifs found with different estimation procedures for the parameters of the random 

model. The table gives the upper-bound on the p-value computed by (2.5). The values 

under brackets show potential motifs which are filtered out as redundant with a smaller 

motif. 

4.2. Stability with respect to the random model estimation. The p-value 
used to decide if a pattern is a motif or not depends on the inferred con- 
nection matrix II. To evaluate the incidence of the choice of the inference 
method, the motif search is run for four distinct models: 

• the Erdos-Renyi model (Erdos and Renyi, 1959) with a connection 
probability such that the expected number of edges is equal to the 
observed one; 

• the Expected Degree (ED) model described in Matias et al.. That model 
draws a link between vertices u and v with a probability proportional 
to d u d v , where d u denotes the observed degree of the node u. An ade- 
quate choice of the normalisation constant leads to a model for which 
the expected degree of each node is almost equal to its observed one; 

• the mixture model MixNet in its Bayesian version (Latouche, Birmele and Ambroise, 
2008) implemented in the mixer R-package; 
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• the mixture model BLOCKS (Nowicki and Snijders, 2001) available in 
the STOCNET software (http://stat.gamma.rug.nl/stocnet/). 

As BLOCKS only supports graphs up to 200 nodes, we consider a ran- 
domly chosen subnetwork of the transcriptional Yeast regulation network 
containing 194 nodes. Inference of the matrix II and search for all local 
motifs of size 3 and 4 is done on that subnetwork. Table 1 shows all the 
motifs found using a threshold of 0.05 on the p-value with at least one of 
the method. One can see that the list of motifs remains stable. 

The only method leading to significantly different p-values is the one re- 
lying on the Erdos-Renyi model, which is known to poorly describe real 
networks. 

On the other hand, the method relying on the Expected Degree model is 
the one selecting the smallest number of motifs. The main difference with 
the two last methods is that the top-motif for Mixnet and BLOCKS is not 
a motif for ED. However, large themes for the bi-fan, which is detected as 
a motif by ED, correspond also to large themes of the first motif of size 3. 
Thus, the themes pointed out by the three methods are the same ones. 

Finally, the two methods taking into account both the degree-distribution 
and the group structure of networks, that is MixNet and BLOCKS, select 
the same motifs and with comparable p-values. 

4.3. Real networks. In order to point out the difference between global 
motifs and local ones, we run our procedure to determine all local motifs of 
size 3, 4 and 5 in two standard networks, both studied in Milo et al., and 
publicly available at http://weizmann.ac.il/mcb/UriAlon. Those networks 
are the transcriptional regulatory networks of Yeast and the electronic circuit 
s420 of the ISCAS89 benchmark. 

Inference of the parameters of the model is done using the Bayesian 
MixNet approach in order to be able to tackle patterns of size 5 in those 
graphs. However, that choice implies that hubs may be grouped in the same 
class, which is relevant from a mixture model point of view but may generate 
quite inhomogeneous groups in terms of degrees and thus local motifs with 
poor biological interpretation. For example, the parameter inference on the 
Yeast network gives rise to a group of two hubs of respective out-degrees 71 
and 44. Hence, the expected outdegree in that group is 57.5 and any star 
pattern will be selected as a local motif when centering the position on the 
largest hub. 

To avoid that phenomenon, we first run the whole procedure with the 
Bayesian MixNet approach and run it again with the Expected Degree ap- 
proach when the position of the theme leading to a local motif shows that 
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Local motif 
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• * 


A 
• -• 


\ J" 
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A 




p-value bound 


2.0 e-16 


2.3 e-9 


4.6 e-4 


8.6 e-4 


NZ 


38 


15 


5 


3 



Table 2 
Local motifs of size 3 found in the Yeast regulatory network. 



The algorithm is run on the Yeast regulatory network with a threshold of 
le — 3. Table 2 shows the local motifs of size 3 found in the network, the row 
Ny- denoting the order of the theme at the position U where Ajj is maximal. 

There are clearly two top local motifs of size 3. 

The first top local motif corresponds to a pair of regulators co-regulating 
a gene. This motif is not selected by the methods of global motif detection 
of Milo et al, Berg and Lassig or Wernicke and Rasche. However, all those 
methods select the motif of size 4 called bi-fan (see Figure 1). That global 
over-representation of the bi-fan is a consequence of the the local motif we 
detect. Indeed, the three largest themes of our local motif are of respective 
orders 38, 32 and 18. Thus, their presence imply ( 3 2 8 ) + ( 3 2 2 ) + ( 1 2 8 ) = 1352 
occurrences of the bi-fan. As the total number of occurrences of the bi-fan 
in the network is 1843, it gives confirmation on the local character of the 
over-representation of that pattern. 

The second one is the feed-forward loop with respect to the deletion of the 
vertex in-degree 2. A feed- forward is composed by a main regulator X and 
a gene Y regulated by X, both co-regulating a third gene Z. This pattern 
is a local motif with respect to the subpattern obtained by deleting Z. This 
indicates the existence of places in the network where a main regulator X 
and a gene Y regulated by X both co-regulate a high number of genes 
Z±,...,Zk- The value of Ny indicates that there is at least such a theme of 
order 38. That phenomenon is already described by U. Alon (Alon, 2007), 
under the denomination Multi- output feed-forward loops. 

The feed-forward loop is also found to be a local motif with respect to 
the deletion of the main regulator X. However, the p-value upper bound 
indicates that the order of the corresponding themes is less significative 
than in the previous case. This fact is confirmed by the lower value of Njj. It 
illustrates a main point of our method which is to differentiate the behaviour 
of the different deletion classes of a pattern. 

Finally, the feed-forward loop is not a local motif with respect to the 
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deletion of the intermediate gene Y, indicating that no main regulator X 
regulates a high number of genes Y\ . . . Y/. in order to regulate a gene Z. 



Local motif 


m • — ■-*• 


















p-value bound 


6.5 e-15 


3.4 e-6 


1.4 e-4 


5.6 e-4 


9.2 e-4 


Nu 


7 


2 


2 


1 


1 



Table 3 
Local motifs found of size 4 in the Yeast regulatory network. 

The method finds five local motifs of size 4 in the Yeast regulatory net- 
work. They are shown in Table 3. 

The first motif appearing in the list is of interest as it corresponds to three 
regulators co-regulating seven genes with an additional regulation between 
two of them. Another way to see the theme of that motif is a multi-output 
feed forward loop of order 7 with a third regulator acting on Z±, . . . , Zf. 
That motif is also found by the global methods but with a higher p-value 
and at the fifth or sixth position among the motifs of size 4. 

The 4 other local motifs have a value of Nfr lower than 2, suggesting that 
they appear in the list because of a very low expected value. Note that the 
bi-fan does not appear in the list as it is filtered out as a redundancy of the 
second motif of size 3. 

Finally, there is only one local motif of size 5. It corresponds to the fourth 
motif of size 4 with a supplementary edge going out from the squared vertex. 
Its value for Ny is 1, showing that its expected is low, such that it becomes 
over-represented even at its first occurence. 

The second network, that is the electronic circuit, has no local motif of 
size 3, 4 or 5 relying on a threshold of .01. However, one global motif of size 3 
and two global motifs of size 4 were found in Milo et al. with Z-scores larger 
than 10. This indicates a distinct behaviour of the two types of networks, 
the occurrences of the global motifs of the electronic network being spread 
in the whole networks rather then agglomerated. 

5. Conclusion. In this work, we propose a new approach to study net- 
work motifs, that is to look for locally over-represented patterns. Our frame- 
work allows us to take into account the over-representation of a pattern with 
respect to its subpatterns, for any pattern size. To list the local motifs of 
a network, we use a model-driven approach to determine a p-value upper 
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bound for each pair (pattern, subpattern) and then apply a filtering proce- 
dure to eliminate redundancy. 

Simulated data show that the error made by taking an upper-bound of 
the exact p-value is reasonable. The application of our method on standard 
real data allows us to find information on the role of the vertices of the 
motifs only by statistical means. Moreover, comparing the lists of local and 
global motifs highlights a strong structural difference between networks of 
different nature. In future work, we will investigate non standard data and 
a deeper understanding of the local motifs which are not global ones. 
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APPENDIX A: LOCAL UPPER BOUND 

To prove Inequality (2.2), we start from Inequality (2.1) which corre- 
sponds to Theorem 2.R in Barbour, Hoist and Janson and apply it for K = 
\\u(l + t)] ■ Then, A v > t if and only if N v > K. 

For all k > K, let u k = %e~ Xu . Then Po(\u)([K, +oo)) = Y.k>K u k and 
VA:> K, 2*±i < j^r. 



Thus, using that K\ > y/2irK(§) K , we get 



1 X§ 



Po(Xu)([K, +oo)) < YT^K\ e 



l+i 

* ,1 \K p K 
< t+i A U e p -\u 

t ^/2^KK K 
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Then, using Inequality (2.1), 

F(A V > t\G[U) ~ m') < \ , + = 7 t^ ^TwT-TTe U 

K ' l J ^ " * + 1 V27rAi;(l + t) (Ac/(1 + t)) A ^( 1+ *) 

< l e -A g ((l+t)log(l+t)-t) 

V27rA[/(l + t) 

Writing that 

P(A a > i) = P(A a > i|G[C7] ~ m')P(G[C/] ~ m') 
yields Inequality (2.2). 

APPENDIX B: LOWER BOUND 

B.l. Local lower bound. The first step is to find the best possible 
bound for the difference between the tail probability of a sum of independent 
random variables and the tail probability of the corresponding Poisson ap- 
proximation. This problem is presented and studied in Barbour, Hoist and Janson 
(1992). We use Theorem 9.D presented in that book, namely 

Theorem B.l (Barbour, Hoist, Janson). Define W = Y^i^i, where Xi 
are independant random variables. Set A = X^i^(^) an< ^ ^2 = Sj^(^) 2 - 

Let K > X be an integer, £ = A2/A and F = {K—X)/y\. Then, uniformly 
in K satisfying Y>1, K < A/2£ and 1 + AT 2 < (16*;)- 1 , we have 

F(W >K) = Po(X)([K, +oo))(l + 0(f) + O^T 2 )). 

Applying this result in our context for W = Njj and K = [A{/(1 + 1)~\ for 
a fixed t may not be possible because in this case T = t\/X~jj and thus the 
condition T > 1 may not be satisfied when Xjj is too small with respect to 
t. 

However, the proof of Theorem B.l uses the assumption T > 1 only once. 
Rewriting it without that condition until that step yields: 

¥{W>K) 

-(1+771X1 + 772) 



Po(X)([K,+oo)) 



with 



(B.l) M < (1 + 2rA^/ 2 ) 2 2A 2 /(K - A) 

and 
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(B.2) \ m \ < J2 Po(X)(r)\e r \/Po(X)([K,+oo)) 

r>K 

where |e r | < 8£r 2 + 2(r - K^FX' 1 / 2 . 

The hypothesis T > 1 is then used to bound the right hand side of In- 
equality (B.2), which can be alternatively bounded by 

|_, / o.T.2 , o*™-l/2V-,_ ^ ^(A)({r}) 

I I £ I 



CJ<,1 


~r ai^i- /\ 


r>K 


' "'Po(\){[K, 


+oo)} 


8^r 2 


+ 2iTX~ 


1/2 E • 

r>K 


Po(X)({r}) 
Po(X)({K}) 




8^r 2 


+ 2^rA" 


1/2 E • 

r>K 






8^r 2 


+ 2^rA- 


-1/2 1 


1 TS ' 





1-X/K' 

the last inequality deriving from the fact that the ratio between two con- 
secutive terms of the sum is always lower than X/K. 

Using the additional condition of Proposition 3.1, that is K > 2A[/, and 
the fact that it implies TA" 1 ' 2 > 1 allows us, using elementary bounds, to 
obtain from Inequalities (B.l) and (B.2) that 



(B.3) M<36£r 2 and M < 16£r 2 . 

As £T 2 < ^, we have 

F(W >K)> Po(X){[K, +oo)}(l - 52 



KX 2 . 



A 2 r 

The second part of the right hand-side of Proposition 3.1 comes from the 
asymptotic comparison between the tail probability Po(X){[Xu(l + t), +oo)} 
and its exponential approximation used in Theorem 2.1. It is derived in a 
very similar way than in Appendix A, using the following lower bound on 
K\, which can be proved from its asymptotic expansion 
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B.2. Global lower bound. 

Lemma B.l. Let p = maxjjlljj and suppose that p = 0{n~^~ e ), with 
e > 7^;. Let 5 = min(e,2ke — 1) > 0. Then 

P(£*) = (1 - rf) V P(.E[r), w/iere 7/ = ©(n" 5 ). 

[7 

Proof: We start from the simplest known lower bound for the probability 
of an union of events, that is: 

(b.4) p(jb*) > e n&u) - \ E E p (^ n ^) 

Given two positions [/ and V, let K\j and ify be such that Ejj = 
{Nu(m) > Ku} and E l v = {N v {m) > K v }. Moreover, let ext(U) be the 
set of vertices yielding to extensions of m' on U and i = \V\U\. Let us also 
recall that k is the size of the pattern m. 

Let S be any set of vertices not intersecting U. We decompose Ey as the 
union of the sets {T C ext(V)} for all sets T of Ky vertices. Thus 

F{E v \ext(U) = S) < E W{T C ext{V)\ext{U) = S) 

T,\T\=K V 
\S\ 

- E E f(TCext(V)\ext(U)=S). 

j=0 \T\=K v ,\TnS\=j 

Let T be such that \T n S\ = j. As m' is connected, at least i edges need 
to be present in V \ U to ensure that G [V] ~ m'. Moreover, to ensure that 
T C ext(V), all the edges between V \ U and T and all the edges between 
U DV and T \ S have to be present, which amounts to a total of at least 
iKy + (k — i){Ky — j) edges. 

Therefore, denoting by p the largest coefficient of the matrix II, P(T C 
ext{V)\ext{U) = S) < p * p iK v+(k^)(K v -j). 

We fix some e > and distinguish between two different cases whether 
the cardinality of S is larger or smaller than n e . 
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• If \S\ < n e , then 



min(\S\,Ky) 

i+iK v +(k-i)(K v -j) 



P(^M(^ S ) < g ( A ,"_i)('"> 

< f*Kv n Kv(K v + 1) max(l, {-tj-gf") 



np n 
(B.5) < p^Kv + ^max^p^p^Sl)^ 

It is straightforward to check that, for large enough n, we have np k < - 
and p^S"! < -. Therefore, the right hand side of Inequality (B.5) is a 
decreasing function of Ky and 

¥(E v \ext(U) = S)< 2 max (np i+k , p 2i \S\) 

Moreover, \-{i+k){\+e) < -i-(2ke-l) for e < \ and -2i{\+e)+e = 
—i — e. Thus, as p < Cn~2~ e for some constant C > 1, and | *S f | < n e , 

(B.6) P(^|ext(*7) = 5) < C^n-^- 5 

• For \S\ > n e , we roughly bound F(E v \ext(U) = S) by 1. However, let 
us note that g(\u,n e ) > n e for large enough n and therefore 

(B.7) Yj ^(ext(U) = S) = ¥(Nu > n e ) 

S,\S\>n£ 

= P( 5 (A C7 ,^)> 5 (A {/ ,n e )) 

< F(g(\u, Nu)>n e ) 

(B.8) < ¥(E\j)e~ nt+t by Theorem 2.2 

(B.9) < F(E t u )n~ l ~ 5 for large enough n. 

Using Inequalities (B.2) and (B.9), we get 
¥{E\j n E v ) = Yl W(E v \ext{U) = S)F(ext{U) = S) 

S;\S\>K V 

< Yj C^n'^FiextiU) = S) + n -i- *P(££r) 

_ftTc<|5|<n £ 

< {C 2k + l)n- i - & ¥{E t u ). 
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As the number of positions V such that \V \ U\ = i is bounded by n l , we 
finally obtain 



n^) - ^ n(c +1)n 

(B.10) < /k(C 2fc + l)n- 5 . 

Inequalities (B.4) and (B.10) yield the lemma. 
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